Draft
Pull Request Overview
This PR adds end-to-end vector store support to DataPusher Plus, embedding resource data via a local SentenceTransformer model and querying via ChromaDB and OpenRouter.
- Introduces a new `DataPusherVectorStore` class for embedding, querying, and managing vector data.
- Integrates vector embedding into the upload job pipeline with optional temporal coverage extraction.
- Adds configuration settings and a helper to check embedding status.
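To make the embed/query/status shape described above concrete, here is a minimal in-memory sketch. It is not the PR's implementation: `ToyVectorStore`, `toy_embed`, `query`, and `is_embedded` are hypothetical names, the hashed bag-of-words embedder stands in for the SentenceTransformer model, and a plain dictionary stands in for ChromaDB. Only the method name `embed_resource` comes from the review, and its real signature is not shown there.

```python
import hashlib
import math
from typing import Dict, List, Tuple


def toy_embed(text: str, dim: int = 64) -> List[float]:
    """Stand-in embedder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector store backed by ChromaDB."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[float]] = {}

    def embed_resource(self, resource_id: str, text: str) -> None:
        # Store one vector per resource; re-embedding overwrites the old one.
        self._rows[resource_id] = toy_embed(text)

    def is_embedded(self, resource_id: str) -> bool:
        # Mirrors the idea of a helper that reports embedding status.
        return resource_id in self._rows

    def query(self, question: str, n_results: int = 3) -> List[Tuple[str, float]]:
        # Rank stored resources by cosine similarity (vectors are unit length).
        q = toy_embed(question)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), rid)
            for rid, vec in self._rows.items()
        ]
        scored.sort(reverse=True)
        return [(rid, score) for score, rid in scored[:n_results]]
```

In a real deployment the ranked results would presumably be passed to the language model behind OpenRouter; that step is left out here because the review does not describe it.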
Reviewed Changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| ckanext/datapusher_plus/vector_store.py | New module implementing vector store integration |
| ckanext/datapusher_plus/jobs.py | Hooks embedding into the datapusher job pipeline |
| ckanext/datapusher_plus/helpers.py | Adds helper to query embedding status |
| ckanext/datapusher_plus/config.py | Adds configuration flags and defaults for vector store |
Comments suppressed due to low confidence (2)
ckanext/datapusher_plus/jobs.py:1599
- The new vector store embedding workflow in the job pipeline lacks corresponding unit or integration tests. Consider adding tests to cover `DataPusherVectorStore.embed_resource` and the job integration path.
if conf.ENABLE_VECTOR_STORE and VECTOR_STORE_AVAILABLE:
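One way such a test could look, as a sketch only: `maybe_embed` below is a hypothetical stand-in for the gated call in `jobs.py`, and the single-argument `embed_resource(resource_id)` call is an assumption, since the real signature is not visible in this review. The vector store is replaced with a `unittest.mock.Mock` so that no model or database is needed.

```python
from types import SimpleNamespace
from unittest.mock import Mock


def maybe_embed(conf, store, resource_id, available=True):
    """Hypothetical stand-in for the gated embedding step in the job pipeline."""
    if conf.ENABLE_VECTOR_STORE and available:
        store.embed_resource(resource_id)
        return True
    return False


def test_embeds_when_enabled():
    store = Mock()
    conf = SimpleNamespace(ENABLE_VECTOR_STORE=True)
    assert maybe_embed(conf, store, "res-1") is True
    store.embed_resource.assert_called_once_with("res-1")


def test_skips_when_disabled_or_unavailable():
    store = Mock()
    disabled = SimpleNamespace(ENABLE_VECTOR_STORE=False)
    enabled = SimpleNamespace(ENABLE_VECTOR_STORE=True)
    # Flag off: no embedding call.
    assert maybe_embed(disabled, store, "res-1") is False
    # Flag on but the optional dependency is missing: still no call.
    assert maybe_embed(enabled, store, "res-1", available=False) is False
    store.embed_resource.assert_not_called()
```

Both functions follow the pytest naming convention, so they would be collected automatically if placed in a test module.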
ckanext/datapusher_plus/jobs.py:1638
- The function `parsedate` is not imported in this module, causing a NameError at runtime. Add the appropriate import (e.g., `from dateutil.parser import parse as parsedate`).
min_year = parsedate(str(min_date)).year
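A sketch of the suggested fix in isolation. The `year_range` helper and the `max_date` argument are hypothetical, introduced only to give the flagged line a runnable context. The `ImportError` fallback exists so the snippet runs without `python-dateutil` installed; it only handles ISO 8601 strings and is not part of the suggestion.

```python
from datetime import datetime

try:
    # The import the review suggests; python-dateutil is a third-party package.
    from dateutil.parser import parse as parsedate
except ImportError:
    # Fallback for this sketch only: accepts ISO 8601 strings, far less lenient.
    def parsedate(value: str) -> datetime:
        return datetime.fromisoformat(value)


def year_range(min_date, max_date):
    """Return (min_year, max_year) for two date-like values (hypothetical helper)."""
    min_year = parsedate(str(min_date)).year
    max_year = parsedate(str(max_date)).year
    return min_year, max_year
```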
        "ckanext.datapusher_plus.embedding_device", "cpu"
    )
    # OpenRouter API Key
    OPENROUTER_API_KEY = tk.config.get(
A default OpenRouter API key is hard-coded in source. This poses a security risk; consider loading it exclusively from a secure environment variable.
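A minimal sketch of what loading the key exclusively from the environment could look like. The variable name `OPENROUTER_API_KEY` and the helper `load_openrouter_api_key` are assumptions for illustration; the PR may prefer a different name or may route the lookup through CKAN's own configuration handling.

```python
import os
from typing import Mapping, Optional


def load_openrouter_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key from the environment, or fail loudly if it is unset."""
    source = os.environ if env is None else env
    key = source.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        # No built-in default: a missing key is a configuration error.
        raise RuntimeError(
            "OPENROUTER_API_KEY is not set; refusing to fall back to a default."
        )
    return key
```

Raising instead of returning a placeholder keeps a missing key visible at startup rather than surfacing later as an authentication failure.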
ef4f36d to 62962b5